Density-based clustering of short-text corpora / Agrupamiento de textos cortos basado en densidad
Abstract
In this work, we analyse the performance of different density-based algorithms on short-text and narrow-domain short-text corpora. We attempt to determine to what extent the features of this kind of corpora affect the density computation of the clusterings obtained, and how robust these algorithms are to the different complexity levels.
Similar resources
A Particle Swarm Optimizer to Cluster Parallel Spanish-English Short-text Corpora / Un Optimizador basado en Cúmulo de Partículas para el Agrupamiento de Textos Cortos de Colecciones Paralelas en Español-Inglés
Short-texts clustering is currently an important research area because of its applicability to web information retrieval, text summarization and text mining. These texts are often available in different languages and parallel multilingual corpora. Some previous works have demonstrated the effectiveness of a discrete Particle Swarm Optimizer algorithm, named CLUDIPSO, for clustering monolingual ...
Clustering Iterativo de Textos Cortos con Representaciones basadas en Conceptos
Abstract: The current trend of working with short documents (blogs, text messages, and others) has generated growing interest in automatic processing techniques for documents with these characteristics. In this context, the clustering of short texts is a very important research area, which can play a fundamental role in organizing these large volumes ...
Performance analysis of Particle Swarm Optimization applied to unsupervised categorization of short texts / Análisis de Prestación de Particle Swarm Optimization aplicado a Categorización no Supervisada de Textos Cortos
Nowadays there is a need to access online information such as abstracts, news, opinions, evaluations of products, etc. That information is generally available on the web as short texts. Previous works have demonstrated the effectiveness of a discrete Particle Swarm Optimization algorithm, named CLUDIPSO, for clustering small short-text corpora. This article presents a preliminary study abou...
Minería de opiniones centrada en tópicos usando textos cortos en español
Users express their feelings about an entity of a specific topic in a free way using short texts on social networks. Sentiment analysis, also known as opinion mining, focuses on examining these texts to determine their polarity. This article presents an approach to the mining of opinions based on topics from Twitter texts in Spanish. The main objective is to decide the polarity of a text, deter...
Un Análisis Comparativo de Estrategias para la Categorización Semántica de Textos Cortos
Nowadays, short-text categorization is an important research area because most of the information we usually receive and work with has this characteristic (e-mails, text messages, news, etc.). Different studies have reported interesting results in text categorization by adding semantic information to documents’ representation. However, these studies have not focused on the particularities tha...
Publication date: 2008